Korea has a relatively short history in the development and use of clinical practice 
guidelines (CPGs). Additionally, it has been difficult to employ the Appraisal of Guidelines 
for Research and Evaluation (AGREE) II instrument due to the lack of consensus and the 
presence of differences in Korean medical settings and in the Korean socio-cultural 
environment. An AGREE II scoring guide was therefore developed to reduce differences 
among evaluators using the same tool. In consideration of the importance of using a 
quantitative measure of satisfaction with the elements described in the AGREE II manual, 
a final draft was developed through a Delphi consensus process. Ninety-two draft scoring 
guides for anchor points 1, 3, 5, and 7 (full score) in 23 items were developed. Consensus 
was defined as agreement among at least 70% of the raters. Agreement on 88 draft 
scoring guidelines was reached in the first Delphi round, and agreement for the remaining 
four was achieved in the second round. The development of an AGREE II scoring guide in 
this study is expected to contribute to improving the CPG environment. 
INTRODUCTION 

The development of clinical practice guidelines (CPGs) in West- 
ern countries started in the 1980s and increased rapidly in the 
1990s. The number of publications related to CPGs recently
exceeded 1,000 per year (1). In Korea, interest in and emphasis
on the development of CPGs have increased in the healthcare
community (2). About 54 CPGs had been developed in Korea
according to the 2006 survey of the Korean Academy of Medical
Sciences (KAMS) (3). However, only 33 CPGs currently offer
full-text service through the Korean Medical Guideline
Information Center (3). Moreover, the methodology and content of
Korean CPGs are not evaluated as systematically as are those in
Western countries (2, 4). A survey of the recognition and use of
CPGs found that 73.4% of respondents recognized the existence of
CPGs, but only 53.3% used the CPGs in actual practice (5). These
results reflect the short history of CPGs in Korea and the
absence of a consensus about the development and use of these
CPGs.

The Appraisal of Guidelines for Research and Evaluation
(AGREE) instrument was first introduced in 2003, and the updated
AGREE II instrument (6), a 23-item instrument addressing six
quality domains, was published in 2009; it is the most widely
used quality evaluation tool for CPGs (7). In Korea, a
translation of the original AGREE instrument was distributed in
2006 (8); the AGREE II instrument was translated into Korean and
introduced to Korean medical societies in 2010 (9). This
evaluation tool is used not only to evaluate the quality of
medical practice guidelines but also to contribute to the
development of a method for refining CPGs and providing
information relevant to the content and wording of CPGs (6).

We do not have as much experience as those in Western
countries with the development of CPGs or the use of the AGREE
instrument. One reason for the difficulties in applying the
AGREE instrument is the lack of consensus, and a report issued
by the KAMS in 2009 noted that efforts to reduce
misunderstandings about AGREE items among evaluators are needed
(5). The KAMS Executive Committee for CPGs (ECC) discussed
the possibility of developing an AGREE II instrument scoring 
guide to resolve ambiguities. Accordingly, the ECC sought to 
develop Korean-style criteria that considered differences in the 
medical environments and cultures of Korea and the West while 
not sacrificing the need for "how-to" information regarding rat- 
ing each item on the AGREE II. The ECC attempted to propose 
a methodology for and provide information about optimal edit- 
ing practices as well as to facilitate achievement of the original 
goals of AGREE II with respect to evaluating the quality of CPGs. 

MATERIALS AND METHODS 

The ECC is an expert panel established to develop CPGs and 
evaluate guidelines using the AGREE instrument; this body is 
responsible for developing a domestic CPG evaluation system, 
providing an educational program for CPG developers, and dis- 
seminating information about the methodology used to develop 
CPGs. Each item on the AGREE II specifies a goal, a method of 
description, and the elements required to meet that goal. These 
are presented under the following headings: "User's Manual
Description," "Where to Look," and "How to Rate," respectively
(6). Those charged with evaluating the guidelines should
thoroughly understand all content before grading the CPGs. These
individuals rate the guidelines from 7 to 1 according to their
satisfaction with the requirements. Based on the sections
addressing "Where to Look" and "How to Rate," the ECC defined a
rating of "7" as "strongly agree" and proposed four anchor points
(1, 3, 5, and 7); ratings of 2, 4, and 6 could be given at the
discretion of the peer reviewer. For the three lower anchor
points and the full score, a draft was prepared that took into
account the importance of the required elements and a
quantitative measure of satisfaction with them, and the final
document emerged from a Delphi consensus process.

Development of the draft scoring guide 

A focus group consisting of an internist, a urologist, a clinical 
pathologist, a psychiatrist, a family physician, and an evidence- 
based medicine (EBM) methodologist, all of whom had great 
interest in and experience with CPGs and the AGREE instru- 
ment, was established. This group generated the draft scoring 
guide for scores of 1, 3, 5, and 7 for each of the AGREE II items. 
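
As a rough illustration of the structure of the draft, the following Python sketch enumerates one scoring guide per combination of AGREE II item and anchor point. The names `ITEMS`, `ANCHORS`, and `draft_guides` are hypothetical and are not taken from the ECC's materials; the empty strings stand in for guide text that is not reproduced here.

```python
from itertools import product

ITEMS = range(1, 24)    # the 23 AGREE II items
ANCHORS = (1, 3, 5, 7)  # anchor points; 7 is the full score

# One placeholder entry per (item, anchor point) pair.
draft_guides = {(item, anchor): "" for item, anchor in product(ITEMS, ANCHORS)}

# Scores without a written guide, left to the reviewer's discretion.
discretionary_scores = sorted(set(range(1, 8)) - set(ANCHORS))

print(len(draft_guides))       # number of draft scoring guides
print(discretionary_scores)    # scores between the anchor points
```

Under these assumptions the enumeration yields 23 x 4 = 92 entries, which matches the number of draft scoring guides reported in this study.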

Modified Delphi consensus process 

We used a modified Delphi consensus process to avoid
domination by individual views in open discussions. Based on a
structured questionnaire, levels of agreement and disagreement
during the Delphi consensus process were expressed in terms of a
nine-point Likert scale. Agreement was rated as 7 or more and
disagreement as 3 or less. Consensus was defined as more than
70% agreement.
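
The consensus rule described above can be sketched in Python as follows. This is a minimal illustration rather than the analysis code used in the study: the function name `summarize_ratings` and the example ratings are hypothetical, and the thresholds are taken directly from the definitions in the text.

```python
from typing import Sequence

AGREE_MIN = 7               # ratings of 7 to 9 count as agreement
DISAGREE_MAX = 3            # ratings of 1 to 3 count as disagreement
CONSENSUS_THRESHOLD = 0.70  # consensus requires more than 70% agreement


def summarize_ratings(ratings: Sequence[int]) -> dict:
    """Summarize nine-point Likert ratings for a single scoring guide."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 9 for r in ratings):
        raise ValueError("ratings must lie on the nine-point scale (1-9)")
    n = len(ratings)
    agree = sum(1 for r in ratings if r >= AGREE_MIN)
    disagree = sum(1 for r in ratings if r <= DISAGREE_MAX)
    return {
        "agree_share": agree / n,
        "disagree_share": disagree / n,
        "consensus": agree / n > CONSENSUS_THRESHOLD,
    }


# Hypothetical ratings from a 13-member panel (not data from this study).
example = [9, 8, 7, 7, 8, 9, 7, 8, 7, 7, 5, 4, 3]
result = summarize_ratings(example)
print(result["consensus"])
```

With a panel of 13 raters, this rule means that at least 10 ratings of 7 or higher are needed, because 9 of 13 is about 69% and 10 of 13 is about 77%.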

Participants in the modified Delphi consensus process were
selected from among ECC members, clinical practitioners who
had participated in the development of CPGs during the past
1-2 years, and experts in EBM methodology. A total of 13 people
participated in the process.

In the first round, the draft scoring guide developed by the 
focus group was distributed to all participants via email. Partici- 
pants were presented with a copy of the Korean version of the 
AGREE II instrument, which included the 92 draft scoring guides. 
They were asked to review the document and, using the
questionnaire, to indicate their level of agreement on a scale
ranging from 1 (strongly disagree) to 9 (strongly agree).
Participants also had the opportunity to provide written feedback.

After collecting the completed survey sheets, we analyzed data 
for each scoring guide separately to determine whether consen- 
sus had been reached. Scoring guides were considered com- 
plete when consensus had been reached. In the absence of con- 
sensus in the first round, a second round was executed. In the 
second round, participants received structured Delphi question- 
naires targeted to the particular foci of this round. Participants 
reviewed the original draft scoring guides, the data from the 
first consensus round, including information about their own 
data relative to those from other participants, and the modified 
scoring guides. Participants were then asked to rate the modi- 
fied scoring guides using a nine-point Likert scale. Delphi rounds 
were repeated until consensus was reached. 

RESULTS 
Round 1 

In the first round, all 13 experts returned the survey sheets, yield- 
ing a response rate of 100%. Consensus was reached on 88 of 
the 92 scoring guides for the 23 items (Table 1). The experts par- 
ticipating in the consensus process were asked to propose mod- 
ifications to the scoring guides when they did not agree, and 
the focus group revised the scoring guides that did not achieve 
consensus during the first round in the light of these modifica- 
tions. These were then reviewed by the focus group during the 
second round (Table 2). 

Round 2 

Thirteen experts participated in the second round, and the re- 
sponse rate was 100%. Agreement was reached with regard to 
the four scoring guides that did not achieve consensus in the 
first round. The 92 scoring guides developed through the
modified Delphi process are available on the website of the
Korean Medical Guideline Information Center
(http://www.guideline.or.kr).

Table 1. First-round agreement on each anchor point of the AGREE items. Consensus was defined as more than 70% agreement. The anchor point 3 on items 2, 5, 11, and 12 failed to reach consensus (bold italic font)

Table 2. First and revised drafts of the four scoring guidelines that failed to reach consensus in the first Delphi round

| AGREE item | Score | Scoring guide |
|---|---|---|
| 2 | First Draft of Point 3 | If it can be inferred that this is a clinical question although the subtitle consists of keywords only. |
| 2 | Revised Draft of Point 3 | If the question is presented with minimal information (e.g., subtitle etc.) |
| 5 | First Draft of Point 3 | When the perspective (experiences and expectations) and the preference of the target group are described or can be estimated based on the subjective experiences of the developer. |
| 5 | Revised Draft of Point 3 | Although the perspective (experiences and expectations) and the preference of the target group are described, the survey method is not described. |
| 11 | First Draft of Point 3 | When evidence or detailed data on benefits, side effects, and factors that are hazardous to health are not presented, and they are not sufficiently addressed in the recommendations. |
| 11 | Revised Draft of Point 3 | When evidence or detailed data on benefits, side effects, and factors hazardous to health are not presented, and only some aspects are addressed in the recommendations. |
| 12 | First Draft of Point 3 | When the connection between the recommendations and the supporting evidence is insufficient. |
| 12 | Revised Draft of Point 3 | When only some of the recommendations are related to the supporting evidence. |

DISCUSSION 

The AGREE II evaluates the quality of CPGs, addresses how and
what to present in published guidelines, and provides a
methodology for the development of CPGs (6). It is an important
tool for education regarding the development of CPGs; it also
allows developers to understand the strengths and weaknesses of
their guidelines when evaluating their own or others' guidelines
and to update their guidelines accordingly.

Although the history of CPGs in Korea is short, demand for 
the development of CPGs has increased. Indeed, nearly 120 
guidelines are being planned by professional member societies 
of the KAMS (10, 11). Presently, CPG developers in Korea
typically use an "adaptation process." Differences in the
interpretations of evaluators when applying the AGREE instrument
to evaluate the quality of existing CPGs are among the most
difficult problems faced by CPG developers. For example, an
AGREE II evaluation was performed on the 10 draft guidelines
developed for the CPGs for stomach cancer in Korea in 2011.
Significant deviations were observed during this process, such
as differences of 2 points or more on 6.6 ± 3.5 (mean ± SD)
items (unpublished data), which reflect problems in the use of
the AGREE instrument in the Korean setting.
The reasons that Korean CPG developers and evaluators expe- 
rience relatively greater difficulty using the AGREE tool include 
the absence of consensus among medical professionals regard- 
ing the relative levels of importance of the elements listed in the 
"how-to-rate" guidelines in the AGREE II manual, differences 
among medical environments, socio-cultural differences between
Korea and Western nations, and subjective interpretations of
each of the questions. Although AGREE is a widely accepted tool
in the field of CPG evaluation, it has the disadvantage that it
can be affected by the subjective perceptions of evaluators
(12-14). In 2010, Dans and Dans (15) noted that AGREE II items
demand that activities be "described well" rather than
"performed well," which causes confusion about the purpose of
the evaluation and, ultimately, about the grading. As the
Korean medical community does not yet have sufficient experi- 
ence with CPGs, differing interpretations and understandings 
among evaluators constitute major obstacles. In Korea, almost
every participant in the development and evaluation of CPGs is
a physician with experience in clinical practice. Thus, the
majority of evaluators considered the expected quality of
performance of the recommendations in addition to the quality
of description in the evaluation process.
This may be another reason for the major differences among 
evaluators using the AGREE II instrument. Thus, it is expected 
that provision of an AGREE II scoring guide will facilitate the 
achievement of consensus about the purpose of and the ap- 
proach to the development of CPGs. 
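
The kind of inter-evaluator deviation mentioned above (items on which scores differ by 2 points or more, summarized as mean ± SD across guidelines) can be illustrated with a short Python sketch. The source does not state how the differences were computed, so the definition used here (the gap between the highest and lowest score given to an item) is an assumption, and the function name `count_discrepant_items` and all scores are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def count_discrepant_items(scores_by_rater: Sequence[Sequence[int]],
                           min_gap: int = 2) -> int:
    """Count items whose highest and lowest scores differ by min_gap or more.

    scores_by_rater holds one list per evaluator, with one score (1-7)
    per AGREE II item, in the same item order for every evaluator.
    """
    n_items = len(scores_by_rater[0])
    if any(len(scores) != n_items for scores in scores_by_rater):
        raise ValueError("every evaluator must score the same items")
    return sum(
        1
        for item_scores in zip(*scores_by_rater)
        if max(item_scores) - min(item_scores) >= min_gap
    )


# Hypothetical scores from two evaluators on five items of one guideline.
rater_a = [7, 5, 3, 6, 4]
rater_b = [6, 3, 3, 2, 5]
n_discrepant = count_discrepant_items([rater_a, rater_b])

# Hypothetical per-guideline counts, summarized as mean and sample SD.
counts_per_guideline = [2, 5, 8]
summary = (mean(counts_per_guideline), stdev(counts_per_guideline))
print(n_discrepant, summary)
```

A count of this kind could be compared before and after introducing the scoring guide, although no such comparison is reported here.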

The ECC attempted to provide anchor points for scores of 1, 3, 5,
and 7 based on importance and a quantitative measure of
satisfaction after agreeing on the standards for a seven-point scale.
These were based on the "How to Rate" instructions for the 
AGREE II instrument because these provided a good descrip- 
tion of the purpose and the content of each item. In the first 
round of the Delphi consensus process, however, we could not 
reach consensus on the anchor point 3 for AGREE II item num- 
bers 2, 5, 11, and 12 (Table 1). Thus, it was difficult to identify 
four graded anchor points based on the "How to Rate" guidelines
alone. In these cases, we tried to provide standards that
also considered whether activities were "performed well."

To identify problems, the guide was applied to several recently
developed domestic CPGs. Although no major differences
among evaluators were observed, the data are not reported here
because too few guidelines were involved. This scoring guide will
be further organized and modified in the process of actually
applying it to diverse CPGs, and it will be revised to reflect further
developments in CPGs and medical environments in Korea.
The scoring guide for the AGREE II instrument proposed herein
can be used to evaluate previously developed CPGs in Korea
and is a useful foundation for the creation of new CPGs in the
future.